Optimal  Adaptive  Waveform  Selection 
for  Target  Detection 


B.  F.  La  Scala 
Dept. of Electrical
and  Electronic  Engineering 
University  of  Melbourne 
Melbourne,  Australia 
E-mail:  bfls@ieee.org 


W.  Moran 
Dept. of Electrical
and  Electronic  Engineering 
University  of  Melbourne 
Melbourne,  Australia 
E-mail:  b.moran@ee.mu.oz.au 


R.  J.  Evans 
Dept. of Electrical
and  Electronic  Engineering 
University  of  Melbourne 
Melbourne,  Australia 
E-mail:  r.evans@ee.mu.oz.au 


Abstract — Modern  phased  array  radars  are  able  to  adaptively 
modify  their  performance  to  the  environment.  To  make  full  use  of 
this  capability,  scheduling  algorithms  need  to  be  designed.  This 
paper  poses  the  problem  of  adaptive  waveform  scheduling  for 
detecting  new  targets  in  the  context  of  finite  horizon  stochastic 
dynamic  programming.  The  result  is  a  scheduling  algorithm  that 
minimises  the  time  taken  to  detect  new  targets,  detecting  these 
targets  in  accordance  with  importance,  while  minimising  the  use 
of  radar  resources. 

I.  Introduction 

Modern phased array radars, with flexible waveform generation
and beam steering capability, are able to adaptively
modify  their  performance  to  suit  a  variety  of  environments. 
This  power  has  not  yet  been  fully  exploited,  in  part  because 
of  the  lack  of  suitable  scheduling  algorithms.  This  paper 
describes  an  optimal  waveform  selection  algorithm  for  the 
detection  of  new  targets.  It  does  not  examine  the  problem 
of  maintaining  tracks  on  established  targets.  In  the  rest  of 
this  paper  we  will  use  the  term  “target  detection”  to  refer 
to  the  identification  of  new  targets,  rather  than  the  detection 
on  subsequent  scans  of  targets  already  under  track. 

Phased  array  radars  can  direct  their  beam  in  any  direction 
without  inertia.  Thus  the  radar  can  switch  between  the  tasks  of 
tracking  existing  targets  and  acquiring  new  targets  essentially 
instantaneously.  Such  a  radar  thus  achieves  the  multi-mission 
capability of target acquisition and target tracking, unlike typical
mechanically scanned radars. This flexibility also allows
the  system  designer  to  consider  the  task  of  searching  for  new 
targets  separately  from,  and  independently  of,  that  of  updating 
established  tracks,  even  if  the  same  radar  is  performing  both 
tasks. 

In  general,  a  waveform  can  be  tailored  to  achieve  good 
Doppler  or  good  range  resolution,  but  not  both  simultaneously. 
This  is  a  problem  in  heavy  clutter  environments,  typified  by  an 
airborne  radar  seeking  to  detect  slow  moving  ground  targets  or 
by  a  littoral  radar  attempting  to  detect  submarine  periscopes  in 
the presence of sea clutter. In both cases the part of the return
ascribable to the clutter can be orders of magnitude larger
than  that  from  the  target  or  targets  of  interest.  Waveforms,  to  a 
greater  or  lesser  extent,  smear  the  clutter  into  the  target  region, 
thus  reducing  detectability.  Once  a  track  has  been  established 
on  a  target,  the  appropriate  choice  of  waveform  can  be  based 
on  the  track  state  estimates.  This  problem  has  been  examined 
in works such as [3], [17], [9], [14], [8].

The problem we are considering in this paper is the detection
of new targets, so the results in the works cited above are
not  directly  applicable.  The  efficient  search  for  new  targets 
has  been  examined  in  [19]  and  [18]  and  an  overview  of  these 
results,  and  related  work,  can  be  found  in  [2,  Ch  14].  While 
these  works  provide  guidelines  for  parameter  selection  in  a 
variety  of  cases,  they  do  not  pose  the  problem  adaptively. 
Clutter  mapping  and  optimal  scheduling  would  permit  the 
tailoring  of  waveforms  and  beam  shapes  to  best  match  the 
working  environment  of  the  radar.  These  techniques,  alone  or 
in  combination,  offer  the  possibility  of  adaptive  adjustment 
of  the  sensor  modes  to  optimise  performance.  Because  of  the 
high  data  rates,  manual  optimisation  of  the  performance  of 
a  modern  radar  by  pulse  tailoring  is  not  possible.  There  is 
significant  potential  for  improvement  in  new  target  detection  if 
adaptive  waveform  selection  is  considered  part  of  the  detection 
process. 

The  simplest  schemes  for  adaptive  waveform  management 
ascribe  a  cost  function  to  the  clutter/target  environment  for 
each  individual  pulse  and  select  the  waveform  that  optimises 
the  cost  function  on  a  pulse  by  pulse  basis.  While  such  a 
“greedy”  scheme  would  radically  improve  performance  over 
conventional  fixed  waveform  radars,  more  can  be  gained  by 
scheduling  waveforms  over  a  number  of  pulses,  so  as  to 
optimise  the  sum  of  the  costs  over  these  pulses. 

In  this  paper  we  pose  the  adaptive  waveform  scheduling 
problem for new target detection as a stochastic dynamic programming
problem of the type known as a partially observed
Markov  decision  problem.  Solutions  of  this  type  of  problem  are 
optimal  control  policies  that  maximise  an  objective  function. 
In  this  framework,  the  adaptive  waveform  selection  problem 
for  target  detection  becomes  the  selection  of  which  sequence 
of  waveforms  to  use  to  maximise  the  overall  rewards  of  target 
detection. These rewards could take into account factors such
as timeliness of detection or target importance and/or threat. In
addition,  the  location  of  targets  already  under  track  can  also  be 
incorporated  in  the  reward  structure,  as  well  as  a  clutter  map 
if  known,  to  prevent  radar  resources  being  wasted  on  known 
scatterers.  In  the  most  general  case,  the  overall  reward  for 
target  detection  could  be  a  combination  of  all  of  these  types 
of  factors. 

The  result  is  an  adaptive  waveform  selection  algorithm  that 
minimises  the  time  taken  to  detect  new  targets,  detecting  these 
targets  in  accordance  with  importance,  while  minimising  the 
use  of  radar  resources. 

II.  Problem  Outline 

Methods  for  waveform  selection  to  improve  both  detection 
and  tracking  performance  have  been  examined  in  [15],  [14] 
and  [9].  In  these  works,  the  problem  was  one  of  both  acquiring 
and  tracking  a  single  target  in  clutter.  In  this  paper  we  will 
only  consider  the  problem  of  detecting  new  targets,  rather 
than  tracking  established  targets.  Once  a  target  is  detected  and 
confirmed,  it  is  handed  over  to  the  track  update  process  and 
is  no  longer  a  concern  of  the  target  detection  process. 

We pose the problem of optimal adaptive waveform selection
for target detection as a finite horizon stochastic control
problem.  While  we  only  consider  detection  performance  here, 
this  approach  can  be  extended  to  consider  both  detection 
and  tracking.  The  problem  of  optimal  beam  scheduling  to 
maximise  tracking  performance  using  this  type  of  approach 
was  examined  in  [10]. 

The problem posed in [10] used an infinite horizon as its authors
were concerned with tracking performance. Since we are only
considering  the  detection  problem  it  is  appropriate  to  use 
a  short,  finite  horizon.  Unlike  [10],  the  short  time  horizon 
allows  us  to  assume  that  the  scene  does  not  change  during 
this  interval,  i.e.  the  targets  do  not  move  appreciably.  This 
assumption  is  reasonable  as  the  number  of  dwells  used  to 
confirm  a  track  on  a  new  target  is  typically  very  small,  see 
for  example  [18]  and  [6]. 

In  addition,  we  will  consider  the  detection  problem  in  each 
radar  beam  to  be  independent  of  other  beams.  That  is,  a  target 
detection  in  one  beam  does  not  provide  any  information  on  the 
likelihood  of  a  detection  in  a  neighbouring  beam.  This  implies 
that  the  beams  are  spaced  with  minimal  overlap.  While  this 
is  not  always  true,  it  is  a  reasonable  assumption  and  provides 
a  useful  starting  point  for  developing  this  adaptive  waveform 
selection  method.  Therefore,  in  the  remainder  of  this  paper 
we  will  consider  the  detection  problem  in  a  particular  beam 
in  isolation. 

The  format  of  the  remainder  of  this  paper  is  as  follows. 
The  next  section  sets  up  the  problem  of  adaptive  waveform 
selection  for  target  detection  as  a  stochastic  control  problem. 
Section  IV  shows  how  the  effect  of  the  choice  of  waveform 
on  the  probability  of  detection  is  incorporated  into  the  model. 
Section  V  discusses  a  number  of  choices  for  the  objective 
function  that  is  to  be  maximised,  while  an  optimal  solution 
method  is  outlined  in  Section  VI. 


III.  Stochastic  Control  Problem 

We divide the area covered by a particular radar beam into a
grid in range-Doppler space, with the cells in range indexed by
$\tau = 1, \ldots, N$ and those in Doppler indexed by $\nu = 1, \ldots, M$.
We make no assumptions about the number of targets that may
be present, thus the number of possible scenes or hypotheses
about the radar scene is $2^{NM}$. Let the space of hypotheses
be denoted by $\mathcal{H}$. In a very simple example, the range space
could be divided into two cells (i.e. a target is either near or
far) and Doppler into three cells (i.e. a target is either receding,
stationary or approaching). The set of hypotheses, $\mathcal{H}$, then has
$2^6 = 64$ elements corresponding to all the possibilities ranging from
no targets present to 6 targets with one in each cell. Note that we
assume that the resolution of the radar and the size of the
cells are such that at most one scatterer can be distinguished
in each cell.
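As a rough illustration of how quickly the hypothesis space grows, the following Python sketch enumerates every scene for a small grid. The function name `enumerate_scenes` and the tuple-of-flags encoding are assumptions made for this example only, and the ordering it produces is not a numbering used by the paper.

```python
from itertools import product


def enumerate_scenes(n_range, n_doppler):
    """Return every occupancy pattern of an n_range x n_doppler grid.

    Each scene is a tuple of 0/1 flags, one per cell. The flag at index
    r * n_doppler + v belongs to cell (r, v), counted from zero.
    """
    n_cells = n_range * n_doppler
    return list(product((0, 1), repeat=n_cells))


# Two range cells and three Doppler cells, as in the simple example.
scenes = enumerate_scenes(2, 3)
print(len(scenes))   # 64
print(scenes[0])     # (0, 0, 0, 0, 0, 0): no targets present
print(scenes[-1])    # (1, 1, 1, 1, 1, 1): one target in each cell
```

Because the count doubles with every extra cell, explicit enumeration of this kind is only practical for very small grids.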

The state of our model is then $x(k) = i$ where $i \in \mathcal{H}$, i.e.
it is one of the possible scenes, and is fixed over the time
interval of interest.

The radar provides noisy measurements $y(k) \in \mathcal{H}$ of
the true scene, $x \in \mathcal{H}$. The probability of receiving a
particular measurement $y(k) = j$ will depend on both the
true, underlying scene and on the choice of waveform used to
generate the measurement.

Let $u(k)$ be the control variable that indicates which waveform
is chosen at time $k$ to generate measurement $y(k+1)$,
where $u(k) \in U$. Then $B(u(k)) = (b_{ji}(u(k)))_{i,j \in \mathcal{H}}$ is the
measurement probability matrix where

$$b_{ji}(u(k)) = \Pr(y(k+1) = j \mid x = i, u(k)).$$

In other words, $b_{ji}(u)$ is the probability of a detection in all the
cells considered to have a scatterer present under hypothesis
$j$ given that the true scene is given by hypothesis $i$ and is
observed with waveform $u$.

Define $\pi = \{u(0), u(1), \ldots, u(T)\}$ where $T + 1$ is the
maximum number of dwells that can be used to detect and
confirm targets for a given beam. Then $\pi$ is a sequence of
waveforms that could be used for that detection process. Let

$$V(x) = E\left[\sum_{k=0}^{T} c(x, u(k))\right]$$

where $c(x, u(k))$ is the reward earned when the scene $x$
is observed using waveform $u(k)$. This reward function will
typically express the capacity of the waveform to discriminate
potential targets in the particular clutter environment expressed
by state $x$. Then the aim of our problem is to find the sequence
$\pi^*$ that satisfies

$$V^*(x) = \max_{\pi} E\left[\sum_{k=0}^{T} c(x, u(k))\right]. \qquad (1)$$

Now,  our  original  aim  was  to  design  an  optimal  waveform 
selection  algorithm  that  can  adapt  to  the  actual  state  of  the 
radar  scene.  However,  knowledge  of  the  actual  state  is  not 
available. Instead, we only have access to noisy measurements
of the scene. To handle this, let $Y_k = (y(1), y(2), \ldots, y(k))$
and $U_k = (u(0), u(1), \ldots, u(k))$ and then define

$$p_i(k) = \Pr(x = i \mid Y_k, U_{k-1}),$$

that is, the vector $p(k)$ is the conditional density of the state
given the measurements and the controls. Using Bayes' rule
and the Law of Total Probability the following recursion can
be derived for $p(k+1)$ [7], [13]

$$p_j(k+1) = \Pr(x = j \mid Y_{k+1}, U_k) = \frac{b_{y(k+1),j}(u(k))\, p_j(k)}{\sum_{i \in \mathcal{H}} b_{y(k+1),i}(u(k))\, p_i(k)}.$$

Let $B(j, u)$ be the matrix with the vector $(b_{ji}(u))_{i \in \mathcal{H}}$ down
the diagonal and zeros elsewhere, then in matrix notation

$$p(k+1) = \frac{B(y(k+1), u(k))\, p(k)}{\mathbf{1}' B(y(k+1), u(k))\, p(k)} \qquad (2)$$

where $\mathbf{1}$ is a column vector of ones. Note that the denominator
of (2) is

$$\mathbf{1}' B(y(k+1), u(k))\, p(k) = \Pr(y(k+1) \mid Y_k, U_k)$$

which is not a function of $p(k)$.
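The recursion (2) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `update_information_state`, the nested-list layout of the measurement matrix and the toy numbers are all assumptions made for the example.

```python
def update_information_state(p, B_u, y):
    """One step of recursion (2) for a fixed waveform u.

    p   : list of prior probabilities p_i(k), summing to 1
    B_u : nested list with B_u[j][i] = Pr(y = j | x = i, u)
    y   : index of the observed detection pattern y(k+1)
    """
    # Numerator of (2): row y of B_u times the prior, element by element.
    unnormalised = [B_u[y][i] * p[i] for i in range(len(p))]
    # Denominator of (2): Pr(y(k+1) | Y_k, U_k).
    evidence = sum(unnormalised)
    if evidence == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [w / evidence for w in unnormalised]


# Toy two-hypothesis example; the numbers are illustrative only.
B_u = [[0.9, 0.2],
       [0.1, 0.8]]   # each column sums to 1
p0 = [0.5, 0.5]
p1 = update_information_state(p0, B_u, y=1)
print(p1)            # approximately [0.111, 0.889]
```

Observing pattern 1, which is far more likely under hypothesis 1 than under hypothesis 0, shifts most of the probability mass onto hypothesis 1.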

The conditional state density vector $p(k)$ is also known as
the information state and it is a sufficient statistic for the true
state $x$. That is, our original stochastic control problem in
terms of $x$ can be rewritten in terms of $p$ [1]. In other words,
the optimal control policy $\pi^*$ that is the solution of (1) is also
the solution of

$$V^*(p(0)) = \max_{\pi} E\left[\sum_{k=0}^{T} c(p(k), u(k))\right] \qquad (3)$$

where $p(0)$ is the a priori probability density of the scene.


IV.  Calculating  the  Measurement  Probabilities 

The  key  feature  of  this  model  for  waveform  selection  is  the 
manner  in  which  the  measurement  probabilities  vary  with  the 
choice  of  waveform.  Recall  that  the  measurement  probability 
$b_{ji}(u(k))$ is

$$b_{ji}(u(k)) = \Pr(y(k+1) = j \mid x = i, u(k))$$


which  is  the  probability  of  obtaining  the  detection  pattern 
described  by  hypothesis  j  when  the  true  scene  is  i  and  is 
measured  by  waveform  u(k). 

To  illustrate  how  the  measurement  probabilities  will  vary 
with  waveform  choice  consider  the  simple  case  when  there  is 
only  a  single  cell  in  range  and  three  in  Doppler  space.  These 
three  correspond  to  receding  targets,  (near)  stationary  targets 
and approaching targets. The set of possible scenes $\mathcal{H}$ is then
i = 1  no targets present
i = 2  a single, receding target
i = 3  a single, stationary target
i = 4  a single, approaching target
i = 5  a receding target and a stationary target
i = 6  a receding target and an approaching target
i = 7  a stationary target and an approaching target
i = 8  a receding, a stationary and an approaching target


Suppose the true scene is $i = 2$, i.e. there is a single, receding
target. If the chosen waveform $u(k)$ has relatively good
Doppler resolution then

$$b_{22}(u(k)) = \Pr(y(k+1) = 2 \mid x = 2, u(k))$$

will be high. However, the probabilities of the hypotheses that
correspond to scenes in which other targets are present, as well
as a receding target, will also be moderately high as noise and
clutter may produce detections in the other cells. That is,
$b_{j2}(u(k))$ for $j = 5, 6, 8$ will be significant. The remaining
hypotheses, $j = 1, 3, 4, 7$, do not contain a receding target, so
$b_{j2}(u(k))$ will be lowest for them: they require that the true
target is not detected and, except for $j = 1$, that clutter or
noise produces a false return in another cell.

On  the  other  hand,  suppose  the  chosen  waveform  has  very 
poor  Doppler  resolution  then  we  might  find  that 


for  all  j ,  i.e.  the  waveform  provides  no  useful  information 
about  the  scene. 

V. Objective Function

We  will  consider  two  basic  forms  for  the  objective  function 
at  time  k,  c(p(k),u(k)),  in  the  adaptive  radar  scheduling 
problem.  The  first,  more  simple  function  is 

$$c(p(k), u(k)) = \sum_{i \in \mathcal{H}} p_i(k) \log(p_i(k)) \qquad (4)$$

where $p_i(k) \log(p_i(k))$ is set to 0 when $p_i(k)$ is sufficiently
small. This function is at a maximum when the entropy is
minimised. Therefore, maximising this objective function will
produce a sequence of waveforms that determines the scene
as accurately as possible over the allowed number of dwells.
The inclusion of a discounting factor, $\gamma^k$, $0 < \gamma < 1$, in
the  value  function  (3)  would  modify  the  solution  so  that  the 
most  accurate  estimate  of  the  scene  was  obtained  as  quickly 
as  possible. 
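The first objective (4) is the negative entropy of the information state. A small Python sketch follows; the function name `negative_entropy` and the cut-off `eps` below which a term is treated as zero are assumptions made for this example, since the text only says "sufficiently small".

```python
import math


def negative_entropy(p, eps=1e-12):
    """Objective (4): sum of p_i * log(p_i), skipping negligible terms."""
    return sum(p_i * math.log(p_i) for p_i in p if p_i > eps)


uniform = [0.125] * 8          # nothing is known about the scene
certain = [1.0] + [0.0] * 7    # the scene is known exactly

print(negative_entropy(uniform))   # -log(8), about -2.079
print(negative_entropy(certain))   # 0.0, the maximum of the objective
```

The value rises from $-\log 8$ towards 0 as the information state concentrates on a single hypothesis, which is why maximising it favours waveforms that resolve the scene.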

The second form of the objective function allows the inclusion
of information such as a known clutter map and allows
possible targets to be classified according to their importance.
In this case

$$c(p(k), u(k)) = \sum_{i \in \mathcal{H}} p_i(k) \log(p_i(k)) \sum_{\tau,\nu} c_{\tau,\nu}\, \delta_{\tau,\nu}(i) \qquad (5)$$

where $\delta_{\tau,\nu}(i) = 1$ if cell $(\tau, \nu)$ contains a scatterer under
hypothesis $i$, and 0 otherwise, and $c_{\tau,\nu}$ is a weight on the importance of
detecting  a  scatterer  in  that  cell.  For  cells  that  correspond 
to  areas  of  known  clutter  or  “uninteresting”  targets  such  as 
receding  targets  that  are  far  from  the  radar,  this  weight  would 
be  small.  For  “interesting”  targets  such  as  those  at  close  range 
and/or  approaching  the  radar  this  weighting  factor  would  be 
large.  The  use  of  the  second  form  of  the  objective  function 
in  (3)  yields  the  optimal  sequence  of  waveforms  to  detect 
the  targets  as  accurately  as  possible  with  the  more  important 
targets  detected  more  quickly  and  accurately  than  those  of 
low  importance.  A  discount  factor  can  be  included  if  it  is  also 
desired  that  the  detection  be  performed  as  quickly  as  possible. 
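To make the weighting in (5) concrete, the sketch below evaluates it for a grid with one range cell and three Doppler cells. The function name `weighted_objective`, the scene encoding as tuples of 0/1 flags, the cut-off `eps` and the weight values are all assumptions chosen for illustration; the scene ordering produced by `itertools.product` is not the numbering i = 1, ..., 8 used in Section IV.

```python
import math
from itertools import product


def weighted_objective(p, scenes, weights, eps=1e-12):
    """Objective (5): each p_i * log(p_i) is scaled by the summed weights
    of the cells that hold a scatterer under hypothesis i."""
    total = 0.0
    for p_i, scene in zip(p, scenes):
        if p_i <= eps:
            continue  # negligible terms are treated as zero, as for (4)
        scene_weight = sum(w for w, occupied in zip(weights, scene) if occupied)
        total += p_i * math.log(p_i) * scene_weight
    return total


# Cells ordered as (receding, stationary, approaching).
scenes = list(product((0, 1), repeat=3))
weights = [0.1, 0.5, 1.0]   # approaching targets matter most
uniform = [1.0 / len(scenes)] * len(scenes)

value = weighted_objective(uniform, scenes, weights)
print(value)   # 0.8 * log(1/8), about -1.664
```

Note that the hypothesis with no targets present contributes nothing to (5), because no cell is occupied and its weight sum is zero.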
VI. Solution Method

The stochastic control problem given by equations (3) and
(2) belongs to the class of problems known as Partially
Observed Markov Decision Problems (POMDP). An overview
of these types of problems and methods for solving them can
be found in [13] and [12]. There are a number of algorithms
for finding both optimal and near-optimal solutions to these
types of problems over a finite horizon. A survey of optimal
algorithms for finite horizon problems can be found in [4].

The various algorithms make use of the fact that over a finite
horizon the objective function of a POMDP is piecewise linear
and convex. Thus the objective function for a given horizon
can be represented by a set of vectors. The various solution
methods provide different ways of finding this set of vectors.
As the size of the adaptive waveform selection problem is
large, we will use the highly efficient refinement of the Witness
algorithm [11] known as Incremental Pruning [5].

The dynamic programming algorithm [1] shows that the
solution of the problem (3) can be found by proceeding
backwards with the recursion

$$V_T(p(T)) = c(p(T))$$

$$V_k(p(k)) = \max_{u(k) \in U} E\left[c(p(k), u(k)) + V_{k+1}\left(p^{u}_{y(k+1)}(k+1)\right)\right], \quad k = T-1, \ldots, 0$$

where $p^u_y$ is the solution of (2) when $u$ is used at dwell $k + 1$,
generating observation $y$. The terminal reward $c(p(T))$ does not
depend on $u$ since no further scans are made. The key result
here is that it is possible to define a new objective function
$V'$ in terms of a given objective function $V$. That is, we can
write

$$V'(p) = \max_{u \in U} \left( c(p, u) + \sum_{y \in \mathcal{H}} \Pr(y \mid p, u)\, V(p^u_y) \right). \qquad (6)$$

In [5] it is shown that this can be broken into a series of
simpler combinations of other objective functions

$$V'(p) = \max_{u \in U} V^u(p)$$

$$V^u(p) = \sum_{y} V^u_y(p)$$

$$V^u_y(p) = \frac{c(p, u)}{|\mathcal{H}|} + \Pr(y \mid p, u)\, V(p^u_y).$$

Over a finite horizon, the objective function $V$ can be expressed
as $V(p) = \max_{\alpha \in S} p'\alpha$ for some finite set of vectors
$S$. This means that it is possible to write

$$V^u_y(p) = \max_{\alpha \in S^u_y} p'\alpha$$

$$V^u(p) = \max_{\alpha \in S^u} p'\alpha$$

$$V'(p) = \max_{\alpha \in S'} p'\alpha$$

for some finite sets of vectors $S^u_y$, $S^u$ and $S'$ for all $u \in U$ and
$y \in \mathcal{H}$. These sets have a unique representation of minimum
size and [5] provides an efficient method for generating these
sets given $S$.


VII. Example

In order to calculate $b_{ji}(u(k))$ we need to make some
assumptions about the likelihood of a detection in a cell when
there are scatterers in nearby cells. A common assumption in
tracking in clutter problems is that a target located in a given
cell will not affect a detection in any other. While this is an
overly idealised assumption, we will use it here as it allows
ready comparison with other work in this area.

A.  Information  State  Recursion 

Under the independence assumption above, the probability
of target existence in a cell now only depends on detections in
the cell and not on measurements in neighbouring cells. Thus
the original information state recursion (2) (which is a vector
of length $2^{NM}$) reduces to $NM$ independent scalar equations.
Define

$$p_{\tau,\nu}(k) = \Pr(e_{\tau,\nu} \mid Y_k, U_{k-1})$$

where $e_{\tau,\nu}$ is the event that there is a target in cell $(\tau, \nu)$. In
other words, $p_{\tau,\nu}(k)$ is the probability of the existence of a
target in cell $(\tau, \nu)$ given all the information available up to
time $k$. Let $y_{\tau,\nu}(k+1)$ be the measurement in the cell $(\tau, \nu)$
at time $k + 1$ (i.e. either a detection, $d_{\tau,\nu}$, or no detection, $\bar{d}_{\tau,\nu}$);
then the recursion for $p_{\tau,\nu}$ is

$$p_{\tau,\nu}(k+1) = \frac{1}{A} \Pr(y_{\tau,\nu}(k+1) \mid e_{\tau,\nu}, u(k))\, p_{\tau,\nu}(k)$$

where

$$A = \Pr(y_{\tau,\nu}(k+1) \mid e_{\tau,\nu}, u(k))\, p_{\tau,\nu}(k) + \Pr(y_{\tau,\nu}(k+1) \mid \bar{e}_{\tau,\nu}, u(k))\, (1 - p_{\tau,\nu}(k))$$

and $\bar{e}_{\tau,\nu}$ is the complementary event that there is no target in
cell $(\tau, \nu)$.
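The scalar recursion for a single cell can be sketched as follows. The function name `update_cell` and its arguments are assumptions for this example: `p_d` and `p_fa` stand for the probabilities of a detection in the cell with and without a target present, for whichever waveform was used, and the numerical values are illustrative only.

```python
def update_cell(p, detected, p_d, p_fa):
    """Posterior probability that the cell holds a target.

    p        : prior probability of target existence in the cell
    detected : True if the threshold was crossed at dwell k+1
    p_d      : Pr(detection | target present, waveform)
    p_fa     : Pr(detection | no target, waveform)
    """
    like_target = p_d if detected else 1.0 - p_d
    like_empty = p_fa if detected else 1.0 - p_fa
    # Normalising constant A of the recursion.
    a = like_target * p + like_empty * (1.0 - p)
    if a == 0.0:
        raise ValueError("measurement has zero probability under the model")
    return like_target * p / a


prior = 0.1
print(update_cell(prior, True, p_d=0.8, p_fa=0.05))    # about 0.64
print(update_cell(prior, False, p_d=0.8, p_fa=0.05))   # about 0.023
```

A single detection raises the existence probability sharply when false alarms are rare, while a missed detection lowers it only modestly, which is the asymmetry the scheduler exploits when choosing the next waveform.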

B.  Measurement  Probabilities 

Let the probability of a detection in cell $(\tau, \nu)$ be
$\Pr(d_{\tau,\nu} \mid e_{\tau,\nu}, u) = P^D_{\tau,\nu}(u)$ and the probability of a false
alarm be $\Pr(d_{\tau,\nu} \mid \bar{e}_{\tau,\nu}, u) = P^F_{\tau,\nu}(u)$. To calculate $P^D_{\tau,\nu}(u)$ and
$P^F_{\tau,\nu}(u)$ we use the receiver model described in [15]. In this
model, all targets have a Swerling I distribution and the
noise is additive, white and Gaussian with known power. The
extension to other models is straightforward.

Under  the  model  of  [15],  when  there  is  no  target  present, 
the  output  of  the  matched  filter  receiver  is  a  complex  Gaussian 
random  variable  with  zero  mean  and  variance  given  by 

$$\sigma_0^2 = 2 N_0 \xi$$

where $N_0$ is the known, ambient noise power and $\xi$ is the
energy of the transmitted pulse. For convenience, we will
assume $\xi$ is the same for all possible waveforms as was done in
both  [15]  and  [9],  although  this  is  not  required  by  our  model. 

The matched filter output in a cell centred on $(\tau_0, \nu_0)$ when
the target return has an actual time delay of $\tau$ and Doppler
shift of $\nu$ is still zero mean and Gaussian, however the variance
is given by

$$\sigma_1^2 = 2 N_0 \xi + 2 \sigma_A^2 \xi^2 \mathcal{A}(\tau_0 - \tau, \nu_0 - \nu)$$

where $\sigma_A^2$ is the variance of the amplitude of the target return
and $\mathcal{A}$ is the ambiguity function. The ambiguity function
specifies the output of the matched filter in the absence of
noise. It is given by the equation [16]

$$\mathcal{A}(\tau, \nu) = \int s(\lambda)\, s^*(\lambda - \tau)\, e^{2\pi j \nu \lambda}\, d\lambda$$

where $s(t)$ is the transmitted baseband signal.

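Recall that the squared magnitude of a complex Gaussian random
variable $y \sim \mathcal{N}(0, \sigma^2)$ is exponentially distributed, with the
density

$$f(x) = \frac{1}{2\sigma^2} \exp\left(-\frac{x}{2\sigma^2}\right), \quad x \ge 0,$$

so, if there is no target in the cell centred on $(\tau, \nu)$ under
hypothesis $i$ and the detection threshold is $D$ then

$$P^F_{\tau,\nu}(u) = \int_D^{\infty} \frac{1}{2\sigma_0^2} \exp\left(-\frac{x}{2\sigma_0^2}\right) dx = \exp\left(-\frac{D}{2\sigma_0^2}\right) = P^F$$

for all $u$ since the energy of the transmitted pulse is assumed
to be the same in all cases.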
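Assuming the closed form $P^F = \exp(-D / (2\sigma_0^2))$ with $\sigma_0^2 = 2 N_0 \xi$, the threshold for a desired false alarm rate follows by inverting the exponential. The Python sketch below shows both directions; the function names and the numerical values are assumptions made for illustration.

```python
import math


def false_alarm_probability(threshold, noise_power, pulse_energy):
    """P_F = exp(-D / (2 * sigma_0^2)) with sigma_0^2 = 2 * N_0 * xi."""
    sigma0_sq = 2.0 * noise_power * pulse_energy
    return math.exp(-threshold / (2.0 * sigma0_sq))


def threshold_for_false_alarm(p_fa, noise_power, pulse_energy):
    """Invert the expression above: D = -2 * sigma_0^2 * log(P_F)."""
    sigma0_sq = 2.0 * noise_power * pulse_energy
    return -2.0 * sigma0_sq * math.log(p_fa)


# Illustrative values: unit noise power and unit pulse energy.
D = threshold_for_false_alarm(1e-3, noise_power=1.0, pulse_energy=1.0)
print(D)                                        # about 27.63
print(false_alarm_probability(D, 1.0, 1.0))     # about 0.001
```

Fixing the threshold this way gives the same false alarm probability for every waveform of equal energy, so the waveforms differ only through their detection probabilities.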

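In the case when a target is present in a cell, assuming its
actual location in the cell has a uniform distribution

$$P^D_{\tau,\nu}(u) = \frac{1}{|A|} \int_{(\tau_a,\nu_a) \in A} \int_D^{\infty} \frac{1}{2\sigma_1^2} \exp\left(-\frac{x}{2\sigma_1^2}\right) dx\, d\tau_a\, d\nu_a$$

$$\phantom{P^D_{\tau,\nu}(u)} = \frac{1}{|A|} \int_{(\tau_a,\nu_a) \in A} \exp\left(-\frac{D}{2\sigma_1^2}\right) d\tau_a\, d\nu_a$$

where $A$ is the resolution cell centred on $(\tau, \nu)$ with volume
$|A|$.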
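The cell average above rarely has a closed form, but it is easy to approximate numerically. The Python sketch below uses a midpoint rule over a rectangular cell. Everything specific in it is an assumption for illustration: the function names, the rectangular cell with the given half-widths, and in particular the Gaussian-shaped surrogate used in place of a real, non-negative ambiguity value, which does not correspond to any particular waveform in the paper.

```python
import math


def detection_probability(threshold, noise_power, pulse_energy, amp_var,
                          ambiguity, half_tau, half_nu, n=21):
    """Average exp(-D / (2 * sigma_1^2)) over target offsets in the cell.

    ambiguity(d_tau, d_nu) must return a real, non-negative value for the
    offset between the cell centre and the true target location.
    """
    total = 0.0
    for a in range(n):
        d_tau = -half_tau + (a + 0.5) * 2.0 * half_tau / n
        for b in range(n):
            d_nu = -half_nu + (b + 0.5) * 2.0 * half_nu / n
            sigma1_sq = (2.0 * noise_power * pulse_energy
                         + 2.0 * amp_var * pulse_energy ** 2
                         * ambiguity(d_tau, d_nu))
            total += math.exp(-threshold / (2.0 * sigma1_sq))
    return total / (n * n)   # uniform target location inside the cell


def gaussian_ambiguity(width_tau, width_nu):
    """Hypothetical smooth surrogate that peaks at 1 for zero offset."""
    def ambiguity(d_tau, d_nu):
        return math.exp(-(d_tau / width_tau) ** 2 - (d_nu / width_nu) ** 2)
    return ambiguity


p_fa = math.exp(-10.0 / (2.0 * 2.0))   # no-target case, sigma_0^2 = 2
pd_broad = detection_probability(10.0, 1.0, 1.0, 4.0,
                                 gaussian_ambiguity(2.0, 2.0), 0.5, 0.5)
pd_sharp = detection_probability(10.0, 1.0, 1.0, 4.0,
                                 gaussian_ambiguity(0.2, 0.2), 0.5, 0.5)
print(p_fa, pd_sharp, pd_broad)
```

With these made-up numbers the surrogate that stays close to its peak across the whole cell gives the higher detection probability, while the sharply peaked one loses detections from targets near the cell edges.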


C.  Objective  Function 

Under the independence assumption, it can be shown that
the first form of the objective function

$$c(p(k), u(k)) = \sum_{i \in \mathcal{H}} p_i(k) \log(p_i(k))$$

reduces to

$$c(p(k), u(k)) = \sum_{\tau,\nu} \left\{ p_{\tau,\nu}(k) \log(p_{\tau,\nu}(k)) + (1 - p_{\tau,\nu}(k)) \log(1 - p_{\tau,\nu}(k)) \right\}.$$

The second form of the objective function discussed in Section
V becomes

$$c(p(k), u(k)) = \sum_{\tau,\nu} \left\{ c_{\tau,\nu}\, p_{\tau,\nu}(k) \log(p_{\tau,\nu}(k)) + (1 - p_{\tau,\nu}(k)) \log(1 - p_{\tau,\nu}(k)) \right\}$$

where $c_{\tau,\nu}$ is a weighting factor that reflects the importance
of detecting a scatterer in cell $(\tau, \nu)$.
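For a single cell the pieces of this example are small enough to solve by exhaustive lookahead, which gives a feel for the finite horizon recursion without the vector-set machinery of Incremental Pruning. The Python sketch below is not the solution method of Section VI: it is a brute-force substitute under stated assumptions, namely one cell, the first (unweighted) per-cell objective, and two hypothetical waveforms named `high_pd` and `low_pd` that share a false alarm probability but differ in detection probability.

```python
import math


def cell_reward(p, eps=1e-12):
    """Per-cell objective: p*log(p) + (1-p)*log(1-p), negligible terms dropped."""
    return sum(q * math.log(q) for q in (p, 1.0 - p) if q > eps)


def update_cell(p, detected, p_d, p_fa):
    """Scalar Bayes update of the target existence probability."""
    like_target = p_d if detected else 1.0 - p_d
    like_empty = p_fa if detected else 1.0 - p_fa
    return like_target * p / (like_target * p + like_empty * (1.0 - p))


def plan(p, waveforms, dwells_left):
    """Return (value, best waveform name) by exhaustive lookahead.

    waveforms maps a name to its (p_d, p_fa) pair for this cell.
    """
    reward_now = cell_reward(p)
    if dwells_left == 0:
        return reward_now, None          # terminal reward, no control
    best_value, best_name = -math.inf, None
    for name, (p_d, p_fa) in waveforms.items():
        pr_det = p_d * p + p_fa * (1.0 - p)      # Pr(detection | p, u)
        expected = 0.0
        for detected, pr_y in ((True, pr_det), (False, 1.0 - pr_det)):
            if pr_y > 0.0:
                p_next = update_cell(p, detected, p_d, p_fa)
                expected += pr_y * plan(p_next, waveforms, dwells_left - 1)[0]
        value = reward_now + expected
        if value > best_value:
            best_value, best_name = value, name
    return best_value, best_name


# Hypothetical waveform set; the numbers are illustrative only.
waveforms = {"high_pd": (0.9, 0.01), "low_pd": (0.6, 0.01)}
value, choice = plan(0.3, waveforms, dwells_left=3)
print(choice, value)
```

The cost of this lookahead grows exponentially with the number of dwells, which is acceptable only because the horizon used to confirm a new target is short.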


VIII.  Conclusion 

By posing the target detection problem as a stochastic dynamic
programming problem we are able to produce schemes
for  optimal  waveform  selection  over  a  finite  horizon.  We  are 
also  able  to  develop  a  flexible  framework  that  can  be  extended 
in  a  number  of  ways.  These  include  changes  to  the  way  the 
detection  probabilities  are  calculated  to  remove  the  idealised 
assumption  that  nearby  scatterers  do  not  interfere  with  one 
another.  This  framework  can  also  be  extended  to  consider 
tracking  as  well  as  detection  performance. 

References 

[1] D. Bertsekas. Dynamic Programming and Optimal Control, volume 1.
Athena Scientific, Belmont, MA, USA, 2nd edition, 2001.

[2] S. Blackman and R. Popoli. Design and Analysis of Modern Tracking
Systems. Artech House, Boston, USA, 1999.

[3] W. D. Blair. Toward the integration of tracking and signal processing
for phased array radar. In Proc. of SPIE conference on Signal and Data
Processing of Small Targets, volume 2235, pages 303-316, Orlando,
Florida, USA, 1994.

[4] A. R. Cassandra. Optimal policies for solving partially observed Markov
decision processes. Technical Report CS-94-14, Brown University,
Providence, RI, USA, 1994.

[5] A. R. Cassandra, M. L. Littman, and N. L. Zhang. Incremental pruning:
A simple, fast, exact method for partially observable Markov decision
processes. In Proc. of 13th Annual Conference on Uncertainty in
Artificial Intelligence (UAI-97), Providence, Rhode Island, USA, 1997.

[6] R. A. Dana and D. Moraitis. Probability of detecting a Swerling I
target on two correlated observations. IEEE Trans. on Aerospace and
Electronic Systems, 17(5):727-730, 1981.

[7] R. J. Elliott, L. Aggoun, and J. B. Moore. Hidden Markov Models:
Estimation and Control. Springer-Verlag, New York, USA, 1995.

[8] S.-M. Hong and Y.-H. Jung. Optimal scheduling of track updates in
phased array radars. IEEE Trans. on Aerospace and Electronic Systems,
34(3):1016-1022, 1998.

[9] D. J. Kershaw and R. J. Evans. Waveform selective probabilistic
data association. IEEE Trans. on Aerospace and Electronic Systems,
33(4):1180-1188, 1997.

[10] V. Krishnamurthy and R. J. Evans. Hidden Markov model multiarm
bandits: A methodology for beam scheduling in multitarget tracking.
IEEE Trans. on Signal Processing, 49(12):2893-2908, 2001.

[11] M. L. Littman. The Witness algorithm for solving partially observed
Markov decision processes. Technical Report CS-94-40, Brown University,
Providence, RI, USA, 1994.

[12] W. S. Lovejoy. A survey of algorithmic methods for partially observed
Markov decision processes. Annals of Operations Research, 28:47-66,
1991.

[13] G. E. Monahan. A survey of partially observable Markov decision
processes: Theory, models and algorithms. Management Science, 28(3):1-16,
1982.

[14] R. Niu, P. Willett, and Y. Bar-Shalom. Tracking considerations in
selection of radar waveform for range and range-rate measurements.
IEEE Trans. on Aerospace and Electronic Systems, 38(2):467-487, 2002.

[15] C. Rago, P. Willett, and Y. Bar-Shalom. Detection-tracking performance
with combined waveforms. IEEE Trans. on Aerospace and Electronic
Systems, 34(2):612-624, 1998.

[16] M. I. Skolnik. Introduction to Radar Systems. McGraw-Hill, 3rd edition,
2001.

[17] D. Stromberg. Scheduling of track updates in phased array radars. In
Proc. of 1996 National Radar Conference, pages 214-219, Ann Arbor,
Michigan, USA, 1996.

[18] G. V. Trunk, J. D. Wilson, and P. K. Hughes, II. Phased array
parameter optimization for low-altitude targets. In Proc. of IEEE Radar
Conference, pages 196-200, Washington, D.C., USA, 1995.

[19] G. van Keuk and S. S. Blackman. On phased array radar tracking and
parameter control. IEEE Trans. on Aerospace and Electronic Systems,
29(1):186-194, 1993.




